1 Improved query performance with variant indexes
Patrick O'Neil, Dalian Quass 



June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data, Volume 26 issue 2

Full text available: pdf(1.54 MB) Additional Information: full citation, abstract, references, citings, index terms



The read-mostly environment of data warehousing makes it possible to use more complex 
indexes to speed up queries than in situations where concurrent updates are present. The
current paper presents a short review of current indexing technology, including row-set 
representation by Bitmaps, and then introduces two approaches we call Bit-Sliced indexing 
and Projection indexing. A Projection index materializes all values of a column in RID order, 
and a Bit-Sliced index essentially takes an orth ... 



2 Query processing and optimization in Oracle Rdb
Gennady Antoshenkov, Mohamed Ziauddin 

December 1996 The VLDB Journal — The International Journal on Very Large Data 

Bases, Volume 5 Issue 4

Full text available: pdf(92.62 KB) Additional Information: full citation, abstract, index terms

This paper contains an overview of the technology used in the query processing and 
optimization component of Oracle Rdb, a relational database management system originally 
developed by Digital Equipment Corporation and now under development by Oracle 
Corporation. Oracle Rdb is a production system that supports the most demanding database 
applications, runs on multiple platforms and in a variety of environments. 

Keywords: Dynamic optimization, Optimizer, Query transformation, Relational database, 
Sampling 



3 Bit-sliced index arithmetic
Denis Rinfret, Patrick O'Neil, Elizabeth O'Neil

May 2001 ACM SIGMOD Record, Proceedings of the 2001 ACM SIGMOD international
conference on Management of data, Volume 30 issue 2

Full text available: pdf(182.64 KB) Additional Information: full citation, abstract, references, citings, index terms

The bit-sliced index (BSI) was originally defined in [ONQ97]. The current paper introduces 
the concept of BSI arithmetic. For any two BSI's X and Y on a table T, we show how to
efficiently generate new BSI's Z, V, and W, such that Z = X + Y, V = X - Y, and W = MIN(X, 
Y); this means that if a row r in T has a value x represented in BSI X and a value y in BSI Y, 
the value for r in BSI Z will be x + y, the value in V will be x - y and the value in W will be 
MIN(x, y). Since a bitmap repre ... 
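
The relation Z = X + Y described in this abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes non-negative integer values with no nulls, stores each bit slice as a Python integer used as a bitmap (bit r stands for row r), and uses plain ripple-carry addition. The function names `build_bsi`, `bsi_add`, and `bsi_values` are hypothetical.

```python
def build_bsi(values, nbits):
    """Build bit slices: slice i has bit r set when bit i of values[r] is 1."""
    slices = [0] * nbits
    for row, v in enumerate(values):
        for i in range(nbits):
            if (v >> i) & 1:
                slices[i] |= 1 << row
    return slices


def bsi_add(x, y):
    """Ripple-carry addition applied to whole bitmaps, so all rows are summed at once."""
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))  # pad the shorter index with empty slices
    y = y + [0] * (n - len(y))
    z, carry = [], 0
    for i in range(n):
        z.append(x[i] ^ y[i] ^ carry)                     # sum bit for every row
        carry = (x[i] & y[i]) | (carry & (x[i] ^ y[i]))   # carry bit for every row
    z.append(carry)  # final carry becomes the top slice
    return z


def bsi_values(slices, nrows):
    """Decode a bit-sliced index back into one integer per row."""
    return [sum(((s >> row) & 1) << i for i, s in enumerate(slices))
            for row in range(nrows)]


X = build_bsi([3, 5, 7, 0], nbits=3)
Y = build_bsi([1, 6, 7, 2], nbits=3)
Z = bsi_add(X, Y)
```

Decoding `Z` with `bsi_values(Z, 4)` gives the row-wise sums of the two input columns; the work per slice is a handful of bitwise operations regardless of the number of rows packed into each bitmap.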

4 AutoAdmin "what-if" index analysis utility
Surajit Chaudhuri, Vivek Narasayya 

June 1998 ACM SIGMOD Record, Proceedings of the 1998 ACM SIGMOD international
conference on Management of data, Volume 27 issue 2

Full text available: pdf(1.52 MB) Additional Information: full citation, abstract, references, citings, index terms

As databases get widely deployed, it becomes increasingly important to reduce the overhead 
of database administration. An important aspect of data administration that critically 
influences performance is the ability to select indexes for a database. In order to decide the 
right indexes for a database, it is crucial for the database administrator (DBA) to be able to 
perform a quantitative analysis of the existing indexes. Furthermore, the DBA should have 
the ability to propo ... 

5 On supporting containment queries in relational database management systems 
Chun Zhang, Jeffrey Naughton, David DeWitt, Qiong Luo, Guy Lohman 

May 2001 ACM SIGMOD Record, Proceedings of the 2001 ACM SIGMOD international
conference on Management of data, Volume 30 issue 2

Full text available: pdf(223.70 KB) Additional Information: full citation, abstract, references, citings, index terms

Virtually all proposals for querying XML include a class of query we term "containment 
queries". It is also clear that in the foreseeable future, a substantial amount of XML data will 
be stored in relational database systems. This raises the question of how to support these 
containment queries. The inverted list technology that underlies much of Information 
Retrieval is well-suited to these queries, but should we implement this technology (a) in a 
separate loosely-coupled IR engin ...

6 Query processing: Factorizing complex predicates in queries to exploit indexes 
Surajit Chaudhuri, Prasanna Ganesan, Sunita Sarawagi 

June 2003 Proceedings of the 2003 ACM SIGMOD international conference on
Management of data

Full text available: pdf(240.56 KB) Additional Information: full citation, abstract, references, index terms

Decision-support applications generate queries with complex predicates. We show how the 
factorization of complex query expressions exposes significant opportunities for exploiting 
available indexes. We also present a novel idea of relaxing predicates in a complex condition 
to create possibilities for factoring. Our algorithms are designed for easy integration with 
existing query optimizers and support multiple optimization levels, providing different
trade-offs between plan complexity and ...

7 Query Optimization in Database Systems
Matthias Jarke, Jurgen Koch 

June 1984 ACM Computing Surveys (CSUR), volume 16 issue 2 

Full text available: pdf(2.84 MB) Additional Information: full citation, references, citings, index terms



8 XPS: a database server for data warehousing
Andreas Weininger 

November 2001 Proceedings of the 4th ACM international workshop on Data
warehousing and OLAP

Full text available: pdf(621.63 KB) Additional Information: full citation, abstract, citings

A database server used for implementing a data warehouse must support other features 
than a database server used for OLTP. Therefore, in this paper we will look specifically at 
features necessary for efficiently processing queries on a database with a star schema 
model, a database scheme which is used very often in data warehousing. We will especially 
analyze the features provided for this by the IBM Extended Parallel Server (XPS). There are 
special star join methods like the Push-Down Hash Semi ... 

9 Reusing invariants: a new strategy for correlated queries
Jun Rao, Kenneth A. Ross 

June 1998 ACM SIGMOD Record, Proceedings of the 1998 ACM SIGMOD international
conference on Management of data, Volume 27 issue 2

Full text available: pdf(1.55 MB) Additional Information: full citation, abstract, references, citings, index terms

Correlated queries are very common and important in decision support systems. Traditional 
nested iteration evaluation methods for such queries can be very time consuming. When 
they apply, query rewriting techniques have been shown to be much more efficient. But 
query rewriting is not always possible. When query rewriting does not apply, can we do 
something better than the traditional nested iteration methods? In this paper, we propose a 
new invariant technique to evaluate correlated queries ... 

10 Strategies for processing ad hoc queries on large data warehouses
Kurt Stockinger, Kesheng Wu, Arie Shoshani 

November 2002 Proceedings of the 5th ACM international workshop on Data 
Warehousing and OLAP 

Full text available: pdf(245.31 KB) Additional Information: full citation, abstract, references, index terms

As data warehousing applications grow in size, existing data organizations and access 
strategies, such as relational tables and B-tree indexes, are becoming increasingly 
ineffective. The two primary reasons for this are that these datasets involve many attributes 
and the queries on the data usually involve conditions on small subsets of the attributes. 
Two strategies are known to address these difficulties well, namely vertical partitioning and 
bitmap indexes. In this paper, we summarize our exp ... 

11 Data structures for efficient broker implementation
Anthony Tomasic, Luis Gravano, Calvin Lue, Peter Schwarz, Laura Haas 

July 1997 ACM Transactions on Information Systems (TOIS), Volume 15 issue 3 

Full text available: pdf(316.45 KB) Additional Information: full citation, abstract, references, citings, index terms, review

With the profusion of text databases on the Internet, it is becoming increasingly hard to find 
the most useful databases for a given query. To attack this problem, several existing and 
proposed systems employ brokers to direct user queries, using a local database of summary 
information about the available databases. This summary information must effectively 
distinguish relevant databases and must be compact while allowing efficient access. We offer 
evidence that one broker, GlOSS ...

Keywords: GlOSS, broker architecture, broker performance, distributed information, grid
files, partitioned hashing 



12 Extensions to Starburst: objects, types, functions, and rules 
Guy M. Lohman, Bruce Lindsay, Hamid Pirahesh, K. Bernhard Schiefer 
October 1991 Communications of the ACM, Volume 34 issue 10

Keywords: Extended relational database management systems, Starburst, extensible
database management systems 



13 Performance comparison of property map and bitmap indexing 
Ashima Gupta, Karen C. Davis, Jennifer Grommon-Litton 

November 2002 Proceedings of the 5th ACM international workshop on Data 
Warehousing and OLAP 

Full text available: pdf(250.60 KB) Additional Information: full citation, abstract, references, index terms

A data warehouse is a collection of data from different sources that supports analytical 
querying. A Bitmap Index (BI) allows fast access to individual attribute values that are 
needed to answer a query by representing the values of an attribute for all tuples 
separately, as bit strings. A Property Map (PMap) is a multidimensional indexing technique 
that pre-computes attribute expressions, called properties, for each tuple and stores the 
results as bit strings [DD97, LD02]. This paper compares t ... 

Keywords: bitmap index, data warehouse, performance study 
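
The bitmap index idea summarized in this abstract (one bit string per attribute value, with one bit per tuple) can be sketched as follows. This is an illustrative toy, not the implementation evaluated in the paper: bitmaps are plain Python integers, the sample columns are invented, and the helper names `build_bitmap_index` and `rows_of` are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to a bitmap; bit r is set when row r holds that value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def rows_of(bitmap):
    """Decode a bitmap into a sorted list of row numbers."""
    rows, row = [], 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Invented sample data: two columns of a five-row table.
region = build_bitmap_index(["east", "west", "east", "north", "west"])
year = build_bitmap_index([2001, 2001, 2002, 2002, 2001])

# region = 'west' AND year = 2001 is answered with a single bitwise AND.
hits = rows_of(region["west"] & year[2001])
```

Here `hits` holds the row numbers that satisfy both conditions; OR and NOT predicates map to the corresponding bitwise operations in the same way.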




14 Similarity queries I: Robust and efficient fuzzy match for online data cleaning 
Surajit Chaudhuri, Kris Ganjam, Venkatesh Ganti, Rajeev Motwani 
June 2003 Proceedings of the 2003 ACM SIGMOD international conference on
Management of data

Full text available: pdf(271.47 KB) Additional Information: full citation, abstract, references, index terms

To ensure high data quality, data warehouses must validate and cleanse incoming data 
tuples from external sources. In many situations, clean tuples must match acceptable tuples 
in reference tables. For example, product name and description fields in a sales record from 
a distributor must match the pre-recorded name and description fields in a product 
reference relation. A significant challenge in such a scenario is to implement an efficient and 
accurate fuzzy match operation that can effec ... 




15 Special issue on prototypes of deductive database systems: The Aditi deductive
database system
Jayen Vaghani, Kotagiri Ramamohanarao, David B. Kemp, Zoltan Somogyi, Peter J. Stuckey,
Tim S. Leask, James Harland

April 1994 The VLDB Journal — The International Journal on Very Large Data Bases, 

Volume 3 Issue 2

Full text available: pdf(2.67 MB) Additional Information: full citation, abstract, references, citings

Deductive databases generalize relational databases by providing support for recursive 
views and non-atomic data. Aditi is a deductive system based on the client-server model; it 
is inherently multi-user and capable of exploiting parallelism on shared-memory 
multiprocessors. The back-end uses relational technology for efficiency in the management 
of disk-based data and uses optimization algorithms especially developed for the bottom-up 
evaluation of logical queries involving recursion. The front ...

Keywords: implementation, logic, multi-user, parallelism, relational database 

16 Heuristic optimization of OLAP queries in multidimensionally hierarchically clustered
databases
Dimitri Theodoratos, Aris Tsois

November 2001 Proceedings of the 4th ACM international workshop on Data 
warehousing and OLAP 

Full text available: pdf(1.44 MB) Additional Information: full citation, abstract, citings

On-line analytical processing (OLAP) is a technology that encompasses applications requiring 
a multidimensional and hierarchical view of data. OLAP applications often require fast 
response time to complex grouping/aggregation queries on enormous quantities of data. 
Commercial relational database management systems use mainly multiple one-dimensional 
indexes to process OLAP queries that restrict multiple dimensions. However, in many cases, 
multidimensional access methods outperform one-dimensiona ... 



17 Tutorial: The relational data model for Design Automation 
Mark N. Haynie 

June 1983 Proceedings of the twentieth design automation conference on Design 
automation 

Full text available: pdf(767.27 KB) Additional Information: full citation, abstract, references, citings, index terms

The relational data model has gained more acceptance in the commercial database 
environment in recent years. It is now finding its way into the Design Automation 
(CAD/CAM) area. This tutorial explains what the relational data model is and how database 
management systems based on it can be used with Design Automation applications. 

18 Data access for the masses through OLE DB 
Jose A. Blakeley 

June 1996 ACM SIGMOD Record, Proceedings of the 1996 ACM SIGMOD international
conference on Management of data, Volume 25 issue 2

Full text available: pdf(1.24 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents an overview of OLE DB, a set of interfaces being developed at Microsoft 
whose goal is to enable applications to have uniform access to data stored in DBMS and 
non-DBMS information containers. Applications will be able to take advantage of the benefits 
of database technology without having to transfer data from its place of origin to a DBMS. 
Our approach consists of defining an open, extensible collection of interfaces that factor and
encapsulate orthogonal, reusable portions ... 

19 The Quadtree and Related Hierarchical Data Structures 
Hanan Samet 

June 1984 ACM Computing Surveys (CSUR), Volume 16 issue 2 

Full text available: pdf(4.87 MB) Additional Information: full citation, references, citings, index terms



20 Industrial sessions: commercial implementation techniques: A compact B-tree 
Peter Bumbulis, Ivan T. Bowman 

June 2002 Proceedings of the 2002 ACM SIGMOD international conference on 
Management of data 

Full text available: pdf(825.30 KB) Additional Information: full citation, abstract, references, index terms

In this paper we describe a Patricia tree-based B-tree variant suitable for OLTP. In this 
variant, each page of the B-tree contains a local Patricia tree instead of the usual sorted 
array of keys. It has been implemented in iAnywhere ASA Version 8.0. Preliminary 
experience has shown that these indexes can provide significant space and performance 
benefits over existing ASA indexes. 